Abstract. Let $p \geq 5$ be a prime. We show that the largest subset of $\mathbb{F}_p^n$ with no
4-term arithmetic progressions has cardinality $O_p(N(\log N)^{-2^{-22}})$, where
$N := |\mathbb{F}_p|^n = p^n$. A result of this type was claimed in our previous paper, but the proof had a gap
(and we issue an erratum for that paper here). We give here a different and significantly
shorter argument that yields the same bound. In fact we prove a stronger result, which
can be viewed as a quantitative version of some previous results of Bergelson-Host-Kra
and the authors.



1. Introduction 

Szemerédi's theorem [H] asserts that any set of integers with positive upper density
contains arbitrarily long arithmetic progressions. This is easily seen to be equivalent to
the assertion that $r_k(N) = o_k(N)$ for all $k \geq 3$, where $r_k(N)$ denotes the cardinality
of the largest subset of $[N] = \{1, \dots, N\}$ containing no $k$-term arithmetic progression
with distinct terms, and $o_k(N)$ denotes a quantity which, when divided by $N$, goes to
zero as $N \to \infty$ for each fixed $k$.

Much attention has been devoted to the question of finding bounds for $r_k(N)$. The
current state of the art is as follows:

(i) Sanders [TO] showed in 2010 that $r_3(N) \ll N(\log N)^{-1+o(1)}$;

(ii) The authors [10] showed in 2005 that $r_4(N) \ll N e^{-c\sqrt{\log\log N}}$;

(iii) Gowers (3] showed in 1998 that $r_k(N) \ll N(\log\log N)^{-c_k}$ for every $k \geq 5$.

We omit a detailed discussion of the history of the problem, referring the reader to the 
three papers cited above. 

In studying these problems a great deal of mileage has been gained from studying
what are known as finite field models. Instead of $r_k(N)$ one considers $r_k(F^n)$, where $F$
is a finite field. The quantity $r_k(F^n)$ is defined to be the cardinality of the largest subset
of the vector space $F^n$ containing no $k$-term arithmetic progression with distinct terms.
In order that a $k$-term arithmetic progression not be degenerate, we must assume that
$F$ has characteristic at least $k$, and we assume that $F = \mathbb{F}_p$ is a prime field for
notational simplicity. When $k = 3$ one traditionally takes $F = \mathbb{F}_3$, and for the purposes
of this paper, where our main interest lies in the case $k = 4$, the reader will lose little
by taking $F = \mathbb{F}_5$. See [B] for a general discussion of the role of finite field models in
additive combinatorics.

Write $N := |F^n|$. Then the current state of the art for this question is as follows:

(i) Bateman and Katz [1] showed in 2011 that $r_3(F^n) \ll_F N(\log N)^{-1-c}$ for some
absolute constant $c > 0$;

(ii) The authors [8] showed in 2005 that $r_4(F^n) \ll_F N(\log\log N)^{-c_F}$;

(iii) The authors [9] in 2009 improved this bound to $r_4(F^n) \ll_F N e^{-c_F\sqrt{\log\log N}}$. We
also claimed the improved bound $r_4(F^n) \ll_F N(\log N)^{-c_F}$.

(iv) It is known, for instance by using the density Hales-Jewett theorem [3], that
$r_k(F^n) = o_{k,F}(N)$ for all $k \geq 5$, assuming of course that $F$ has characteristic at
least $k$.

Recently, we discovered that our argument in [9] claiming the bound
$r_4(F^n) \ll_F N(\log N)^{-c_F}$ contains a gap, the nature of which is described in Appendix A. (The
"cheap" bound $r_4(F^n) \ll_F N e^{-c_F\sqrt{\log\log N}}$ established in that paper is however not
subject to this problem, nor is the analogous bound for $r_4(N)$ established in [10] by
similar methods.) The main purpose of this paper is to provide an alternate, simpler,
and - most importantly - correct argument that recovers this bound. In fact, we obtain
the following stronger statement. By an affine subspace $W$ of $F^n$ we mean a coset of a
linear subspace $\dot W$ of $F^n$.

Theorem 1.1. Let $F = \mathbb{F}_p$ be a finite field with $p \geq 5$. Let $n \in \mathbb{N}$, let $0 < \alpha, \varepsilon \leq 1$, and
let $A$ be a subset of $F^n$ of density at least $\alpha$. Then there exists an affine subspace $W$ of
$F^n$ of codimension at most $C_F \varepsilon^{-2^{20}}$ with the property that

$$|\{(x,r) \in W \times \dot W : x, x+r, x+2r, x+3r \in A\}| \geq (\alpha^4 - \varepsilon)|W|^2,$$

where $C_F > 0$ depends only on $F$.

A qualitative variant of this theorem already appeared (as a joint result of the authors
of the present paper) in [7, Theorem 4.1], which in turn was inspired by an ergodic
theoretic result of Bergelson, Host, and Kra [2]; see also [11, Theorem 1.12] for another
related result. Note that the quantity $\alpha^4|W|^2$ is the natural quantity associated to the
statistic $|\{(x,r) \in W \times \dot W : x, x+r, x+2r, x+3r \in A\}|$, since if $A$ were a random
subset of $F^n$ with density $\alpha$, then the expected value of this statistic would indeed be
$\alpha^4|W|^2$. The exponent $2^{20}$ is certainly not best possible, and is mostly dependent on the
exponent $2^{16}$ appearing in the inverse theorem for the $U^3$ norm in [8]; any improvement
on the exponents in the latter result would lead to improvements in the exponents here.

As an immediate corollary of the above theorem, we recover the main result claimed
in [9].

Corollary 1.2. Let $F = \mathbb{F}_p$ be a finite field with $p \geq 5$. Let $n \in \mathbb{N}$, and write $N := |F^n|$.
Then $r_4(F^n) \ll_F N(\log N)^{-2^{-22}}$.

Proof. Let $A$ be a subset of $F^n$ with no length 4 progressions and cardinality $r_4(F^n)$,
and set $\alpha := |A|/|F^n| = r_4(F^n)/N$. By Theorem 1.1 with $\varepsilon = \alpha^4/2$ (say), we can find
an affine subspace $W$ of $F^n$ of codimension at most $C_F \alpha^{-2^{22}}$ for some $C_F > 0$ depending
only on $F$, such that

$$|\{(x,r) \in W \times \dot W : x, x+r, x+2r, x+3r \in A\}| \geq \tfrac{1}{2}\alpha^4|W|^2.$$

On the other hand, as $A$ has no length 4 progressions, only pairs with $r = 0$ contribute,
so the left-hand side is at most $|W|$.
We conclude that $|W| \leq 2/\alpha^4$ which, when combined with the codimension bound on
$W$, implies that

$$n \leq C_F \alpha^{-2^{22}} + \log_{|F|}\frac{2}{\alpha^4} \ll_F \alpha^{-2^{22}}.$$

Since $\log N = n \log |F|$, this gives $\alpha \gg_F (\log N)^{-2^{-22}}$, and the claim follows. □
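The step "the left-hand side is at most $|W|$" in the proof above can be checked by brute force in a toy case. The following is a minimal sketch, assuming the illustrative choices $F = \mathbb{F}_5$, $n = 2$ and $W = F^n$; the helper names (`step`, `count_pairs`, `greedy_progression_free`) and the greedy construction are our own and do not come from the paper.

```python
from itertools import product

p, n = 5, 2                                   # toy parameters: the space F_5^2
points = list(product(range(p), repeat=n))

def step(x, r, k):
    """Return x + k*r in F_p^n."""
    return tuple((xi + k * ri) % p for xi, ri in zip(x, r))

def count_pairs(A):
    """Number of pairs (x, r) with x, x+r, x+2r, x+3r all in A."""
    A = set(A)
    return sum(
        all(step(x, r, k) in A for k in range(4))
        for x in points
        for r in points
    )

def greedy_progression_free():
    """Greedily build a set containing no 4-term progression with r != 0."""
    A = set()
    for x in points:
        candidate = A | {x}
        # Pairs with r = 0 always contribute exactly len(candidate),
        # so equality means there is no genuine progression.
        if count_pairs(candidate) == len(candidate):
            A = candidate
    return A

A = greedy_progression_free()
trivial_count = count_pairs(A)        # only the r = 0 pairs survive
full_count = count_pairs(points)      # every pair counts when A is everything
```

For a progression-free set the count collapses to $|A|$, whereas for the whole space it is $|W|^2$; this gap is what the proof exploits.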



2. Notation and an outline of the argument 

Throughout this paper the field $F$ is fixed, and all constants are permitted to depend
on $F$. As such we will no longer explicitly subscript these constants by $F$, for instance
abbreviating $c_F$ as $c$.

For technical reasons it is convenient to replace the vector space $F^n$ by the more
general concept of an affine space, by which we mean a coset $W = x + \dot W$ of a linear
subspace $\dot W$ of some ambient vector space $F^n$, where $x$ is also an element of $F^n$. We
will often refer to $W$ without any explicit mention of the underlying space $F^n$. The
dimension of $W$, $\dim(W)$, is defined to be $\dim(\dot W)$. If $W'$ is an affine space which is
contained in another affine space $W$, we call $W'$ an affine subspace of $W$, and define
the codimension of $W'$ inside $W$ to be $\dim(W) - \dim(W')$.

Our argument is similar to that in [7] or [11], but with more attention paid to the
quantitative estimates. The main step in our argument will be what we call a local
Koopman-von Neumann theorem, the detailed statement of which is Theorem 4.10.
Roughly speaking, this theorem asserts that if $A$ is a subset of some affine space $W$ of
some density $\alpha$, then we can find an affine subspace $W'$ of $W$ of bounded codimension
on which $A$ can be approximated (in the sense of the Gowers $U^3(W')$ norm) by a
"quadratically structured" function $f$, that is to say a function of a bounded number of
quadratic polynomials on $W'$. Furthermore we may ensure that the density of $A$ on $W'$
is basically at least as large as $\alpha$, and crucially we may also ensure that the quadratic
polynomials involved in the construction of $f$ obey a "high rank" condition, in the sense
that any non-trivial linear combination of these polynomials has high rank. The most
important ingredient in the proof of Theorem 4.10 is the inverse theorem for the Gowers
$U^3$-norm in finite fields [8].

Once Theorem 4.10 is proven it follows from the theory of Gowers norms that the
count of 4-term arithmetic progressions of $A$ in $W'$ is very close to the corresponding
count of 4-term arithmetic progressions weighted by $f$. On the other hand, by invoking
a "counting lemma" we will be able to obtain an accurate and explicit Fourier-analytic
formula for the number of 4-term arithmetic progressions weighted by $f$. It turns out
that there is a useful positivity property in this formula, essentially first observed in [2]
in a slightly different context, which allows one to give a lower bound for this count of
essentially $\alpha^4$. This gives the main theorem.

The paper is organised as follows. In §3 we define the Gowers $U^3$-norm and prove
some simple facts relating it to 4-term progressions. Section §4 is the heart of the paper:
here we prove the local Koopman-von Neumann theorem, Theorem 4.10. Section §5 is
concerned with analysing quadratically structured functions, and in particular with
counting 4-term progressions weighted by them. From this, the main theorem is easily
established.

Notation. Our notation is standard in additive combinatorics. We draw the reader's
attention to our use of $\mathbb{E}_{x \in X} f(x)$ to denote the average of $f$ over the (finite) set $X$. We
write $\|f\|_{L^1(X)} := \mathbb{E}_{x \in X}|f(x)|$ and $\|f\|_{L^2(X)} := (\mathbb{E}_{x \in X}|f(x)|^2)^{1/2}$. We use the letter $C$ to
denote an absolute constant; it need not be the same at every occurrence. When we
want to emphasise different constants we use subscripts and refer to $C_0, C_1, C_2, \dots$. In
this paper, each constant $C$ could be specified explicitly if desired. We use $X \ll Y$ or
$X = O(Y)$ to denote the bound $|X| \leq CY$ for some constant $C$.

3. Progressions of length 4 and the $U^3$ norm

Recall from the previous section the notion of an affine space $W$ with associated linear
space $\dot W$.

Let $W$ be an affine space over $F$. If $f_0, f_1, f_2, f_3 : W \to \mathbb{C}$ are functions then we
define

$$T_W(f_0, f_1, f_2, f_3) := \mathbb{E}_{x \in W, d \in \dot W} f_0(x) f_1(x+d) f_2(x+2d) f_3(x+3d),$$

a normalised count of the 4-term arithmetic progressions in $W$ weighted by the functions
$f_0, f_1, f_2$ and $f_3$. In the special case in which all the $f_i$ are equal to some function $f$
then we will write

$$T_W(f) := T_W(f, f, f, f).$$

We record a bound for $T_W$ in terms of the Gowers $U^3$-norm, a result of a type known as
a generalized von Neumann theorem. For a much lengthier introduction to the Gowers
$U^3$-norm, see [8]. If $f : W \to \mathbb{C}$ is a function, we define $\|f\|_{U^3(W)}$ to be the unique
non-negative real number such that

$$\|f\|_{U^3(W)}^8 := \mathbb{E}_{x \in W, h_1, h_2, h_3 \in \dot W} f(x)\overline{f(x+h_1)}\,\overline{f(x+h_2)}\,\overline{f(x+h_3)} f(x+h_1+h_2) \times$$
$$\times f(x+h_1+h_3) f(x+h_2+h_3) \overline{f(x+h_1+h_2+h_3)}.$$

This is the standard definition, modified slightly so that it applies to affine spaces as well
as linear ones. It can be shown that the quantity on the right is real and non-negative,
so $\|f\|_{U^3(W)}$ is well-defined. It can also be shown that $\|\cdot\|_{U^3(W)}$ defines a norm, but we
shall not need this fact in this paper.

The next lemma is of a type referred to in the literature as a Generalised Von Neumann
Theorem.

Lemma 3.1. Let $W$ be an affine space and suppose that $f_0, f_1, f_2, f_3 : W \to \mathbb{C}$ are
bounded in magnitude by 1. Then we have

$$|T_W(f_0, f_1, f_2, f_3)| \leq \min_{i} \|f_i\|_{U^3(W)}.$$

Proof. This is jH Proposition 1.7], and is proved in §4 of that paper using three 
applications of the Cauchy-Schwarz inequality. Versions of this inequality appear in 
several earlier works also, such as [4]. The extension to affine spaces $W$ is trivial and
left to the reader. □

Using the telescoping identity

$$T_W(f) - T_W(g) = T_W(f-g, g, g, g) + T_W(f, f-g, g, g) + T_W(f, f, f-g, g) + T_W(f, f, f, f-g),$$

we conclude the following bound.

Lemma 3.2. Let $f, g : W \to \mathbb{C}$ be functions on an affine space $W$ bounded in magnitude
by 1. Then we have

$$|T_W(f) - T_W(g)| \leq 4\|f - g\|_{U^3(W)}.$$
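The telescoping identity and the resulting bound can be sanity-checked numerically. The sketch below assumes the toy case $W = \mathbb{F}_5$ (so $n = 1$) and real-valued random functions, for which the complex conjugations in the $U^3$-norm play no role; the function names `T` and `u3_norm` are illustrative choices of ours, not notation fixed by the paper.

```python
import random
from itertools import product

p, n = 5, 1                                    # toy case: W = F_5
W = list(product(range(p), repeat=n))

def step(x, *shifts):
    """Return x + h_1 + ... + h_m in F_p^n."""
    return tuple(sum(c) % p for c in zip(x, *shifts))

def scale(d, k):
    """Return k*d in F_p^n."""
    return tuple((k * di) % p for di in d)

def T(f0, f1, f2, f3):
    """Normalised weighted count of 4-term progressions."""
    total = sum(
        f0[x] * f1[step(x, d)] * f2[step(x, scale(d, 2))] * f3[step(x, scale(d, 3))]
        for x in W for d in W
    )
    return total / len(W) ** 2

def u3_norm(f):
    """Gowers U^3 norm of a real-valued function on W."""
    total = 0.0
    for x, h1, h2, h3 in product(W, repeat=4):
        total += (f[x] * f[step(x, h1)] * f[step(x, h2)] * f[step(x, h3)]
                  * f[step(x, h1, h2)] * f[step(x, h1, h3)]
                  * f[step(x, h2, h3)] * f[step(x, h1, h2, h3)])
    # The average is non-negative in exact arithmetic; clamp rounding noise.
    return max(total / len(W) ** 4, 0.0) ** 0.125

rng = random.Random(0)
f = {x: rng.uniform(-1, 1) for x in W}
g = {x: rng.uniform(-1, 1) for x in W}
diff = {x: f[x] - g[x] for x in W}

lhs = T(f, f, f, f) - T(g, g, g, g)
rhs = T(diff, g, g, g) + T(f, diff, g, g) + T(f, f, diff, g) + T(f, f, f, diff)
bound = 4 * u3_norm(diff)
```

Here `lhs` and `rhs` agree up to rounding, and `abs(lhs)` should not exceed `bound`, in line with Lemma 3.2.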

4. Factors and Quadratically structured functions 

In this section we develop the language and tools needed to discuss the "quadratically
structured functions" mentioned in §2.

Definition 4.1 (Factors). If $W$ is a finite set then by a factor $\mathcal{B}$ we mean simply a
partition of $W$ into finitely many pieces which, in this paper, we refer to as atoms.

Remark. The nomenclature hints at connections with ergodic theory which in some
sense inspire some of the arguments of this paper. We say that a function $f : W \to \mathbb{C}$
is $\mathcal{B}$-measurable if it is constant on atoms of $\mathcal{B}$.

If $f : W \to \mathbb{C}$ is any function then we may define the conditional expectation

$$\mathbb{E}(f|\mathcal{B})(x) := \mathbb{E}_{\mathcal{B}(x)} f \quad \text{for all } x \in W,$$

where $\mathcal{B}(x)$ is the unique atom in $\mathcal{B}$ that contains $x$. Equivalently, $\mathbb{E}(f|\mathcal{B})$ is the
orthogonal projection of $f$ (in the Hilbert space $L^2(W)$) to the space of $\mathcal{B}$-measurable functions.

Suppose that we are given a finite collection $\phi_1, \dots, \phi_d$ of functions from $X$ to some
other set $Y$. Then these may be used to define a factor $\mathcal{B} = \mathcal{B}_{\phi_1,\dots,\phi_d}$ in a natural way
by taking the atoms of $\mathcal{B}$ to consist of sets of the form
$\{x \in X : \phi_1(x) = y_1, \dots, \phi_d(x) = y_d\}$. For factors defined in this way we refer to $d$ as (an upper bound for) the
complexity of the factor $\mathcal{B}$.

We say that a factor $\mathcal{B}'$ is a refinement of $\mathcal{B}$ if every atom in $\mathcal{B}$ is a union of atoms in $\mathcal{B}'$.
We will also need the notion of the join $\mathcal{B} \vee \mathcal{B}'$ of two factors, which is simply the factor
formed by intersecting the atoms of $\mathcal{B}$ with those of $\mathcal{B}'$ (or equivalently, the minimal
factor that refines both $\mathcal{B}$ and $\mathcal{B}'$). Note that
$\mathcal{B}_{\phi_1,\dots,\phi_d} \vee \mathcal{B}_{\phi'_1,\dots,\phi'_{d'}} = \mathcal{B}_{\phi_1,\dots,\phi_d,\phi'_1,\dots,\phi'_{d'}}$ for
any functions $\phi_i, \phi'_j$.

Definition 4.2 (Quadratic functions). Suppose that $\dot W$ is a linear space. By choosing
a basis for $\dot W$ we may identify it with $F^n$ for some $n$. By a quadratic function on $\dot W$ we
mean a function $\phi : \dot W \to F$ of the form $\phi(x) = x^T M x + r^T x + c$, where $M$ is an $n \times n$
symmetric matrix over $F$, $r \in F^n$, and $c \in F$. By the rank of $\phi$ we understand the rank
of the matrix $M$. More generally, if $W = \dot W + w$ is an affine space then $\phi : W \to F$
is a quadratic function if the function $\dot\phi : \dot W \to F$ defined by $\dot\phi(x) := \phi(x + w)$ is a
quadratic function on $\dot W$. We define the rank of $\phi$ to be the rank of $\dot\phi$.

Definition 4.3 (Quadratic factor). If $X = W$ is an affine space and the $\phi_i$ are all
quadratic functions then we refer to $\mathcal{B} = \mathcal{B}_{\phi_1,\dots,\phi_d}$ as a quadratic factor.

We will be mostly interested in quadratic factors with a particularly pleasant property. 

Definition 4.4 (Quadratic factors and rank). Let $W$ be an affine space. Then by a
quadratic factor of rank at least $r$ and complexity $d$ we mean a factor $\mathcal{B} = \mathcal{B}_{\phi_1,\dots,\phi_d}$
defined by quadratic functions $\phi_1, \dots, \phi_d : W \to F$ which satisfy the rank separation
condition $\operatorname{rank}(\lambda_1\phi_1 + \dots + \lambda_d\phi_d) \geq r$ whenever $\lambda_1, \dots, \lambda_d$ are elements of $F$, not all
zero.

The utility of the rank separation condition will become clear as we proceed, and
is particularly clearly illustrated by Lemma 5.2, where it is shown that all atoms of $\mathcal{B}$
have roughly the same size if one assumes this condition. One may also count arithmetic
progressions across atoms of a high-rank quadratic factor: see Lemma 5.3. Some related
uses of high-rank quadratic factors and functions occur in [SI dH US].

For technical reasons we will need to "localise" quadratic factors to certain subspaces. 
This requires some additional definitions. 

Definition 4.5 (Local factors). Let $W$ be an affine space. By a local factor of codimension
at most $D$ we mean a factor $\mathcal{B}_1$ of $W$ whose atoms are all affine subspaces of
$W$ of codimension at most $D$; note that we allow these subspaces to have different
orientations (and even different codimensions). By a local quadratic factor of codimension
at most $D$, rank at least $r$, and complexity at most $d$ we mean a pair $(\mathcal{B}_1, \mathcal{B}_2)$ of factors,
where $\mathcal{B}_1$ is a local factor on $W$ of codimension at most $D$, and $\mathcal{B}_2$ is an extension of $\mathcal{B}_1$
with the property that on each atom $W'$ of $\mathcal{B}_1$, the restriction $\mathcal{B}_2|_{W'}$ of $\mathcal{B}_2$ to $W'$ is a
quadratic factor on $W'$ of rank at least $r$ and complexity at most $d$.

We say that a local quadratic factor $(\mathcal{B}'_1, \mathcal{B}'_2)$ is a refinement of another local quadratic
factor $(\mathcal{B}_1, \mathcal{B}_2)$ if $\mathcal{B}'_1$ is a refinement of $\mathcal{B}_1$ and $\mathcal{B}'_2$ is a refinement of $\mathcal{B}_2$.

Some facts about factors. In this subsection we collect together some lemmas about 
factors, and quadratic factors in particular. 

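Lemma 4.6. Suppose that $X$ is a finite set and that $f : X \to \mathbb{C}$ is a function. Suppose
that $\mathcal{B}$ and $\mathcal{B}'$ are two factors, with $\mathcal{B}'$ a refinement of $\mathcal{B}$. Then

$$\|\mathbb{E}(f|\mathcal{B}')\|_{L^2(X)} \geq \|\mathbb{E}(f|\mathcal{B})\|_{L^2(X)}.$$

Proof. We have $\mathbb{E}(\mathbb{E}(f|\mathcal{B}')|\mathcal{B}) = \mathbb{E}(f|\mathcal{B})$, and so $\mathbb{E}(f|\mathcal{B})$ is the orthogonal projection
of $\mathbb{E}(f|\mathcal{B}')$ (in $L^2(X)$) to the space of $\mathcal{B}$-measurable functions. In particular
$\mathbb{E}(f|\mathcal{B}') - \mathbb{E}(f|\mathcal{B})$ is orthogonal to $\mathbb{E}(f|\mathcal{B})$, and so Pythagoras' theorem yields

$$\|\mathbb{E}(f|\mathcal{B}')\|_{L^2(X)}^2 = \|\mathbb{E}(f|\mathcal{B})\|_{L^2(X)}^2 + \|\mathbb{E}(f|\mathcal{B}') - \mathbb{E}(f|\mathcal{B})\|_{L^2(X)}^2 \geq \|\mathbb{E}(f|\mathcal{B})\|_{L^2(X)}^2.$$

This concludes the proof. □

We shall refer to $\|\mathbb{E}(f|\mathcal{B})\|_{L^2(X)}^2$ as the energy of $f$ relative to the factor $\mathcal{B}$. Note that
if $f$ is bounded (by 1) then the energy lies in the interval $[0,1]$.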
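To make the notions of factor, conditional expectation and energy concrete, here is a minimal sketch on the toy space $X = \mathbb{F}_5^2$. The two quadratic functions `phi1`, `phi2` and the set `A` are arbitrary illustrative choices of ours, and a factor is represented simply as a map from points to atom labels.

```python
from collections import defaultdict
from itertools import product

p, n = 5, 2
X = list(product(range(p), repeat=n))

def factor(*phis):
    """Factor generated by phis: label x by (phi_1(x), ..., phi_d(x))."""
    return {x: tuple(phi(x) for phi in phis) for x in X}

def cond_exp(f, B):
    """Conditional expectation E(f|B): average f over the atom containing x."""
    sums, counts = defaultdict(float), defaultdict(int)
    for x in X:
        sums[B[x]] += f[x]
        counts[B[x]] += 1
    return {x: sums[B[x]] / counts[B[x]] for x in X}

def energy(f, B):
    """Energy of f relative to B, i.e. the squared L^2 norm of E(f|B)."""
    g = cond_exp(f, B)
    return sum(g[x] ** 2 for x in X) / len(X)

def phi1(x):
    return (x[0] * x[0] + x[1]) % p

def phi2(x):
    return (x[0] * x[1]) % p

A = {x for x in X if (x[0] + 2 * x[1] * x[1]) % p in (0, 1)}
ind = {x: 1.0 if x in A else 0.0 for x in X}

B = factor(phi1)                 # complexity 1
B_fine = factor(phi1, phi2)      # the join with the factor of phi2; refines B

coarse_energy = energy(ind, B)
fine_energy = energy(ind, B_fine)
tower_gap = max(
    abs(u - v)
    for u, v in zip(cond_exp(cond_exp(ind, B_fine), B).values(),
                    cond_exp(ind, B).values())
)
```

Passing to the refinement `B_fine` can only increase the energy, and `tower_gap` measures the failure of the identity $\mathbb{E}(\mathbb{E}(f|\mathcal{B}')|\mathcal{B}) = \mathbb{E}(f|\mathcal{B})$ used in the proof, which should vanish up to rounding.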

The following lemma, which shows how to make a quadratic factor high-rank, is 
crucial. 

Lemma 4.7. Suppose that $W$ is an affine space and that $\mathcal{B}$ is a quadratic factor of
complexity at most $d$ on $W$. Then there is a local quadratic factor $(\mathcal{B}_1, \mathcal{B}_2)$ of codimension
at most $dr + d^2 + d$, rank at least $r$, and complexity at most $d$, such that $\mathcal{B}_2$ is a
refinement of $\mathcal{B}$.

Proof. Suppose that $\mathcal{B}$ is defined by quadratic forms $\phi_1, \dots, \phi_d$. If, for every choice of
$\lambda_1, \dots, \lambda_d \in F$, not all zero, we have the high-rank condition
$\operatorname{rank}(\lambda_1\phi_1 + \dots + \lambda_d\phi_d) \geq r + d$, then the result is immediate (taking $\mathcal{B}_1$ to be trivial and $\mathcal{B}_2 := \mathcal{B}$). Otherwise, we may rescale and
relabel so that, without loss of generality, $\lambda_d = 1$. Consider the homogeneous linear
space $\dot W$. The fact that this rank is at most $r + d$ means that the kernel of $\lambda_1\phi_1 + \dots + \lambda_d\phi_d$,
$\dot W'$ say, has codimension at most $r + d$. Restricted to this kernel, $\phi_d$ is a linear combination
of $\phi_1, \dots, \phi_{d-1}$.

If now $\operatorname{rank}(\lambda_1\phi_1 + \dots + \lambda_{d-1}\phi_{d-1}) \geq r + d$ for every choice of $\lambda_1, \dots, \lambda_{d-1}$, not all zero,
then stop; otherwise, continue this rank reduction process. It clearly lasts at most $d$ steps,
at which point (after relabelling) we have a subspace $\dot W' \leq \dot W$ of codimension at most $d(r + d)$
and some $d'$, $0 \leq d' \leq d$, such that, restricted to $\dot W'$, each of $\phi_1, \dots, \phi_d$ is a linear
combination of $\phi_1, \dots, \phi_{d'}$.

BEN GREEN AND TERENCE TAO

This means that, restricted to any coset $V$ of $\dot W'$ in $W$, the factor $\mathcal{B}|_V$ has as
a refinement a factor cut out by the $d'$ quadratics $\phi_1, \dots, \phi_{d'}$, which satisfy a rank
condition with parameter $r + d$, as well as up to $d$ linear phases. The affine subspaces cut
out by these linear phases, over all cosets $V$, then form a local factor $\mathcal{B}_1$ of codimension
at most $d(r + d) + d$.

Restricted to an atom $W'$ of the local factor $\mathcal{B}_1$, the quadratics $\phi_1, \dots, \phi_{d'}$ still satisfy
a rank condition with parameter $r$. Take $\mathcal{B}_2 := \mathcal{B} \vee \mathcal{B}_1$; then $(\mathcal{B}_1, \mathcal{B}_2)$ is a local quadratic
factor of codimension at most $dr + d^2 + d$, rank at least $r$, and complexity at most $d$ as
desired. □

We have studied the properties of quadratic factors, but we have yet to say why they
are useful. The next result, an inverse theorem for the $U^3(W)$-norm, is the key input in
this regard. Here, $e_F : F \to \mathbb{C}$ is defined by $e_F(x) = e^{2\pi i x/p}$, where $F = \mathbb{F}_p$ is identified
with $\mathbb{Z}/p\mathbb{Z}$.

Theorem 4.8. Let $W$ be a linear space over $F$, and let $f : W \to \mathbb{C}$ be a bounded
function such that $\|f\|_{U^3(W)} \geq \eta$ for some $0 < \eta \leq \tfrac12$. Then there is a linear subspace
$W' \leq W$ of codimension at most $O(\eta^{-2^{16}})$ such that, for each coset $W' + t$ of $W'$ in $W$,
there exists a quadratic phase function $\phi_t : W' + t \to F$ such that

$$\mathbb{E}_{t \in W/W'} |\mathbb{E}_{x \in W' + t} f(x) e_F(-\phi_t(x))| \geq \eta^{2^{16}}. \quad (4.1)$$

Proof. See [8, Theorem 2.3]. □

We have the following corollary of this in the language of factors.

Corollary 4.9 (Inverse theorem for $U^3$, corollary). Let $W$ be an affine space and suppose
that $f : W \to \mathbb{C}$ is a bounded function such that $\|f\|_{U^3(W)} \geq \eta$, where $0 < \eta \leq \tfrac12$. Then
there is a local quadratic factor $(\mathcal{B}_1, \mathcal{B}_2)$ of codimension $O(\eta^{-2^{16}})$ and complexity at most
1 such that $\|\mathbb{E}(f|\mathcal{B}_2)\|_{L^2(W)} \geq \eta^{2^{16}}$.

Proof. Without loss of generality we may take $W$ to be a linear space. Let $W'$ and
the $\phi_t$ be as in Theorem 4.8, and let $\mathcal{B}_1$ be the local factor generated by the cosets of
$W'$, thus $\mathcal{B}_1$ has codimension $O(\eta^{-2^{16}})$. Let $\mathcal{B}_2$ be the factor whose atoms are of the
form $\{x \in W' + t : \phi_t(x) = a\}$ for various $t \in W/W'$ and $a \in F$: then $(\mathcal{B}_1, \mathcal{B}_2)$ is a local
quadratic factor of codimension $O(\eta^{-2^{16}})$ and complexity at most 1. Observe that the
left-hand side of (4.1) can be rewritten as

$$\mathbb{E}_{t \in W/W'} |\mathbb{E}_{x \in W' + t} \mathbb{E}(f|\mathcal{B}_2)(x) e_F(-\phi_t(x))|,$$

which, by the Cauchy-Schwarz inequality, is bounded by $\|\mathbb{E}(f|\mathcal{B}_2)\|_{L^2(W)}$. The claim
follows. □

Theorem 4.10 (Local Koopman-von Neumann). Let $A$ be a set with density $\alpha$,
$0 < \alpha \leq 1$, on some affine space $W$. Let $0 < \eta, \varepsilon < \tfrac12$, and suppose that $r \geq 1$. Then
there is an affine subspace $W' \subseteq W$ of codimension $O(\varepsilon^{-3}\eta^{-2^{19}} r)$ such that the density
of $A$ on $W'$ is at least $\alpha - \varepsilon$, and such that there is a quadratic factor $\mathcal{B}$ on $W'$ of rank
at least $r$ and complexity $O(\varepsilon^{-1}\eta^{-2^{17}})$ such that $\|1_A - \mathbb{E}(1_A|\mathcal{B})\|_{U^3(W')} \leq \eta$.

Proof. For $i = 0, 1, 2, \dots$ we are going to define a local quadratic factor $(\mathcal{B}_{1,i}, \mathcal{B}_{2,i})$ on
$W$ of codimension at most $d_i$, rank at least $r$, and complexity at most $i$. To initialise
the construction, we set $\mathcal{B}_{1,0}$ and $\mathcal{B}_{2,0}$ to be the trivial factor $\{\emptyset, W\}$ on $W$. Suppose
we have completed this construction up to and including step $i$. Consider an atom $W'$
of $\mathcal{B}_{1,i}$, thus $W'$ is a subspace of codimension at most $d_i$. Let us say that such an atom
$W'$ is regular if $\|1_A - \mathbb{E}(1_A|\mathcal{B}_{2,i})\|_{U^3(W')} \leq \eta$. If the union of the regular atoms of $\mathcal{B}_{1,i}$ has
density less than $1 - \varepsilon/2$ in $W$ then we continue to step $(i + 1)$; otherwise we stop.

If an atom $W'$ of $\mathcal{B}_{1,i}$ is not regular, then by Corollary 4.9 we may find a local quadratic
factor $(\mathcal{B}_{1,i,W'}, \mathcal{B}_{2,i,W'})$ on $W'$ of codimension $O(\eta^{-2^{16}})$ and complexity at most 1 (with
no bound on the rank at present), such that

$$\|\mathbb{E}(1_A - \mathbb{E}(1_A|\mathcal{B}_{2,i}) \,|\, \mathcal{B}_{2,i,W'})\|_{L^2(W')} \geq \eta^{2^{16}}. \quad (4.2)$$

If $W'$ is regular, we set $\mathcal{B}_{j,i,W'}$ (for $j = 1, 2$) to be the trivial factor $\{\emptyset, W'\}$ on $W'$.

For $j = 1, 2$, let $\mathcal{B}'_{j,i}$ be the factor generated by $\mathcal{B}_{j,i}$ and each of the $\mathcal{B}_{j,i,W'}$ as $W'$
varies over the atoms of $\mathcal{B}_{1,i}$, thus the restriction of $\mathcal{B}'_{j,i}$ to each atom $W'$ of $\mathcal{B}_{1,i}$ is
simply $\mathcal{B}_{j,i}|_{W'} \vee \mathcal{B}_{j,i,W'}$. Then $(\mathcal{B}'_{1,i}, \mathcal{B}'_{2,i})$ is a local quadratic factor of codimension at
most $d_i + O(\eta^{-2^{16}})$ and complexity at most $i + 1$, which is a refinement of $(\mathcal{B}_{1,i}, \mathcal{B}_{2,i})$.

The rank properties of the original local quadratic factor $(\mathcal{B}_{1,i}, \mathcal{B}_{2,i})$ have been destroyed
by the passage to the extension $(\mathcal{B}'_{1,i}, \mathcal{B}'_{2,i})$, but we can recover the rank property using
Lemma 4.7. Namely, if $W''$ is an atom of $\mathcal{B}'_{1,i}$, then by applying Lemma 4.7 to the
quadratic factor $\mathcal{B}'_{2,i}|_{W''}$, we may find a local quadratic factor $(\mathcal{B}''_{1,W''}, \mathcal{B}''_{2,W''})$ on $W''$
of codimension at most $(i + 1)r + (i + 1)^2 + i + 1$, rank at least $r$, and complexity
at most $i + 1$, with $\mathcal{B}''_{2,W''}$ refining $\mathcal{B}'_{2,i}|_{W''}$. Gluing together the $(\mathcal{B}''_{1,W''}, \mathcal{B}''_{2,W''})$ as
$W''$ varies among the atoms of $\mathcal{B}'_{1,i}$, we obtain a local quadratic factor $(\mathcal{B}_{1,i+1}, \mathcal{B}_{2,i+1})$ of
codimension at most $d_{i+1} := d_i + O(\eta^{-2^{16}}) + (i + 1)r + (i + 1)^2 + i + 1$, complexity at
most $i + 1$, and rank at least $r$, which refines $(\mathcal{B}'_{1,i}, \mathcal{B}'_{2,i})$ and hence $(\mathcal{B}_{1,i}, \mathcal{B}_{2,i})$.

From (4.2) and Lemma 4.6 we have

$$\|\mathbb{E}(1_A - \mathbb{E}(1_A|\mathcal{B}_{2,i}) \,|\, \mathcal{B}_{2,i+1})\|_{L^2(W')}^2 \geq \eta^{2^{17}}$$

for each irregular atom $W'$ of $\mathcal{B}_{1,i}$. For regular atoms $W'$ we use the trivial lower bound
of 0. Since the irregular atoms have total density more than $\varepsilon/2$ in $W$, averaging over all
atoms $W'$ we conclude that

$$\|\mathbb{E}(1_A - \mathbb{E}(1_A|\mathcal{B}_{2,i}) \,|\, \mathcal{B}_{2,i+1})\|_{L^2(W)}^2 \gg \varepsilon\eta^{2^{17}}.$$

By Pythagoras' theorem, the left-hand side can be rewritten as
$\|\mathbb{E}(1_A|\mathcal{B}_{2,i+1})\|_{L^2(W)}^2 - \|\mathbb{E}(1_A|\mathcal{B}_{2,i})\|_{L^2(W)}^2$, and so we have the energy increment

$$\|\mathbb{E}(1_A|\mathcal{B}_{2,i+1})\|_{L^2(W)}^2 \geq \|\mathbb{E}(1_A|\mathcal{B}_{2,i})\|_{L^2(W)}^2 + c\varepsilon\eta^{2^{17}}$$

for some constant $c = c_F > 0$. On the other hand, the energy $\|\mathbb{E}(1_A|\mathcal{B}_{2,i})\|_{L^2(W)}^2$ clearly
can only take values between 0 and 1, and therefore this iteration can only occur at
most $O(\varepsilon^{-1}\eta^{-2^{17}})$ times. At each stage of the iteration, the complexity of the factor
increases by at most one, and the codimension increases by at most

$$O(\eta^{-2^{16}}) + (i + 1)r + (i + 1)^2 + i + 1 \ll \varepsilon^{-2}\eta^{-2^{18}} r$$

since $i = O(\varepsilon^{-1}\eta^{-2^{17}})$. At the end of this iteration, we obtain a final local quadratic
factor $(\mathcal{B}_{1,i}, \mathcal{B}_{2,i})$ of codimension $O(\varepsilon^{-3}\eta^{-2^{19}} r)$, rank at least $r$, and complexity $O(\varepsilon^{-1}\eta^{-2^{17}})$,
with the property

$$\|1_A - \mathbb{E}(1_A|\mathcal{B}_{2,i})\|_{U^3(W')} \leq \eta \quad (4.3)$$

for all atoms $W'$ of $\mathcal{B}_{1,i}$ outside of an exceptional set of atoms whose union has density
at most $\varepsilon/2$ in $W$.

As before, we call an atom $W'$ of $\mathcal{B}_{1,i}$ regular if (4.3) holds. We wish to find a regular
atom $W'$ of $\mathcal{B}_{1,i}$ for which, in addition, the density of $A$ is at least $\alpha - \varepsilon$. Suppose this
is not possible. Then we have

$$\mathbb{E}_{W'}(1_A - \alpha) < -\varepsilon$$

for all regular $W'$, while for irregular $W'$ we have the trivial upper bound of 1. Averaging
over all atoms $W'$, we conclude that

$$\mathbb{E}_W(1_A - \alpha) < -\varepsilon(1 - \varepsilon/2) + \varepsilon/2 < 0.$$

But the left-hand side is zero by definition of $\alpha$, a contradiction, and the claim follows.

If we now set $W'$ to be a regular atom of $\mathcal{B}_{1,i}$ on which $A$ has density at least $\alpha - \varepsilon$,
and $\mathcal{B}$ to be the restriction of $\mathcal{B}_{2,i}$ to $W'$, we obtain the conclusion of Theorem 4.10. □



5. High-rank quadratic factors 

We turn now to a more detailed study of quadratic factors of high rank, showing how 
to control the size of atoms in these factors, and later how to count 4-term arithmetic 
progressions in functions measurable with respect to one of these factors. 

Suppose that $W$ is a linear space, and that $\phi_1, \dots, \phi_d : W \to F$ are quadratic maps.
Let $\mathcal{B} = \mathcal{B}_{\phi_1,\dots,\phi_d}$ be the quadratic factor defined by the $\phi_i$, that is to say the partition of
$W$ in which the atoms are sets of the form $\{x : \phi_1(x) = c_1, \dots, \phi_d(x) = c_d\}$. Throughout
this section we will assume that $\mathcal{B}$ has rank at least $r$, which means that the homogeneous
parts of $\phi_1, \dots, \phi_d$ satisfy the rank separation condition $\operatorname{rank}(\lambda_1\phi_1 + \dots + \lambda_d\phi_d) \geq r$
whenever $\lambda_1, \dots, \lambda_d \in F$ are not all zero.

An important role will be played by the map $\Phi : W \to F^d$ defined by
$\Phi(x) = (\phi_1(x), \dots, \phi_d(x))$. Note that an atom of $\mathcal{B}$ is simply the inverse image, in $W$, of some
point in $F^d$ under this map $\Phi$. If $f : W \to \mathbb{C}$ is a bounded $\mathcal{B}$-measurable function then
we write $\mathbf{f} : F^d \to \mathbb{C}$ for the function which satisfies $f(x) = \mathbf{f}(\Phi(x))$ for all $x \in W$.

Suppose that $F = \mathbb{F}_p$, which we identify with $\mathbb{Z}/p\mathbb{Z}$. Write $e_F : F \to \mathbb{C}^\times$ for the
standard character on $F$, which maps $x$ to $e(x/p)$ where $e(t) := e^{2\pi i t}$. Our first lemma
is a standard Gauss sum estimate.

Lemma 5.1. Suppose that $W$ is an affine space and that $\phi : W \to F$ is a quadratic
form with rank $r$. Then $|\mathbb{E}_{x \in W} e_F(\phi(x))| \leq |F|^{-r/2}$.

Proof. By translating if necessary (which does not affect the rank) we may identify $W$
with $F^n$. Suppose that $\phi(x) = x^T M x + r^T x + c$ with $M$ symmetric.
Squaring and changing variables, we have

$$|\mathbb{E}_{x \in F^n} e_F(\phi(x))|^2 = |\mathbb{E}_{x,h} e_F(\phi(x + h) - \phi(x))| \leq \mathbb{E}_h |\mathbb{E}_x e_F(2h^T M x)|,$$

where we have used the triangle inequality to discard the factor $e_F(h^T M h + r^T h)$, which has
magnitude 1 and does not depend on $x$. If $Mh \neq 0$ then the expectation over $x$ vanishes. If $Mh = 0$,
which happens for $|F|^{n-r}$ values of $h$, then it equals 1. Therefore
$|\mathbb{E}_{x \in F^n} e_F(\phi(x))|^2 \leq |F|^{-r}$, which is the stated result. □
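The Gauss sum estimate is easy to test by direct summation. The following sketch assumes $F = \mathbb{F}_5$, $n = 3$ and the illustrative rank-2 matrix $M = \operatorname{diag}(1, 2, 0)$; the function name `gauss_average` and the particular linear terms are our own choices. It suggests that the magnitude is exactly $|F|^{-r/2}$ when the linear part lies in the image of $M$, and can drop to 0 otherwise, so that in general one should only expect an upper bound.

```python
import cmath
from itertools import product

p = 5

def e_F(t):
    """Standard additive character on F_p."""
    return cmath.exp(2j * cmath.pi * (t % p) / p)

def gauss_average(M, r, c, n):
    """Average of e_F(x^T M x + r^T x + c) over x in F_p^n."""
    total = 0
    for x in product(range(p), repeat=n):
        quad = sum(M[i][j] * x[i] * x[j] for i in range(n) for j in range(n))
        lin = sum(r[i] * x[i] for i in range(n))
        total += e_F(quad + lin + c)
    return total / p ** n

M = [[1, 0, 0],
     [0, 2, 0],
     [0, 0, 0]]                       # symmetric, rank 2 over F_5
rank = 2

pure = abs(gauss_average(M, (0, 0, 0), 0, 3))      # homogeneous form
shifted = abs(gauss_average(M, (1, 3, 0), 2, 3))   # linear part in the image of M
degenerate = abs(gauss_average(M, (0, 0, 1), 0, 3))  # linear part outside it
bound = p ** (-rank / 2)
```

In the first two cases completing the square reduces the sum to a product of classical Gauss sums, while in the third the free variable $x_3$ carries a non-trivial linear character and the average cancels completely.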

Using this lemma we can show that the atoms in a high-rank quadratic factor have 
roughly the same size. We phrase this as a result about averaging functions, as follows. 

Lemma 5.2. Let $\mathcal{B}$ be a quadratic factor of complexity $d$ on an affine space $W$, with
rank at least $r$. Let $\Phi$ be the corresponding map from $W$ to $F^d$. Let $f : W \to \mathbb{C}$ be a
bounded $\mathcal{B}$-measurable function, and let $\mathbf{f}$ be the corresponding function on $F^d$. Then

$$|\mathbb{E}_W(f) - \mathbb{E}_{F^d}(\mathbf{f})| \leq |F|^{(d-r)/2}.$$

Proof. We employ a Fourier expansion on $F^d$. The dual of $F^d$ may be identified with
$F^d$ itself by associating to $\xi \in F^d$ the character $x \mapsto e_F(\xi \cdot x)$. Thus we define the
Fourier transform

$$\hat{\mathbf{f}}(\xi) := \mathbb{E}_{x \in F^d} \mathbf{f}(x) e_F(-\xi \cdot x).$$

By the inversion formula we have

$$f(x) = \mathbf{f}(\Phi(x)) = \sum_{\xi \in F^d} \hat{\mathbf{f}}(\xi) e_F(\xi \cdot \Phi(x)).$$

Since $\hat{\mathbf{f}}(0) = \mathbb{E}_{F^d}(\mathbf{f})$, we conclude that

$$\mathbb{E}_W(f) - \mathbb{E}_{F^d}(\mathbf{f}) = \sum_{\xi \in F^d \setminus \{0\}} \hat{\mathbf{f}}(\xi) \mathbb{E}_{x \in W} e_F(\xi \cdot \Phi(x)).$$

Now from the rank hypotheses we see that $\xi \cdot \Phi(x)$ is a quadratic phase of rank at least
$r$ whenever $\xi \in F^d \setminus \{0\}$. Therefore, the expectation has magnitude at most $|F|^{-r/2}$ by
Lemma 5.1. Thus by the triangle inequality we have

$$|\mathbb{E}_W(f) - \mathbb{E}_{F^d}(\mathbf{f})| \leq |F|^{-r/2} \sum_{\xi \in F^d} |\hat{\mathbf{f}}(\xi)|.$$

By Cauchy-Schwarz and Plancherel we have

$$\sum_{\xi \in F^d} |\hat{\mathbf{f}}(\xi)| \leq |F|^{d/2} \|\mathbf{f}\|_{L^2(F^d)}$$

and the claim now follows from the boundedness of $\mathbf{f}$. □

We turn now to the somewhat more complicated task of counting 4-term arithmetic
progressions using the configuration space. It is easy to see that, for any $x \in W$ and
$h \in \dot W$ we have the relation $\Phi(x) - 3\Phi(x + h) + 3\Phi(x + 2h) - \Phi(x + 3h) = 0$. It turns out
that if the rank $r$ is sufficiently large then this is in some sense the "only" constraint
on the points $\Phi(x + ih)$, and furthermore there is a certain uniform distribution among
all the values of $\Phi(x + ih)$ obeying this constraint. This leads to the heuristic formula

$$T_W(f) \approx \mathbb{E}_{x_0, x_1, x_2, x_3 \in F^d : x_0 - 3x_1 + 3x_2 - x_3 = 0} \prod_{i=0}^{3} \mathbf{f}(x_i),$$

which can be rearranged using the Fourier transform as

$$T_W(f) \approx \sum_{\xi \in F^d} |\hat{\mathbf{f}}(\xi)|^2 |\hat{\mathbf{f}}(3\xi)|^2.$$

The next lemma constitutes the rigorous version of the above heuristics.
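Both ingredients of this heuristic can be checked by brute force in a small case. The sketch below assumes $F = \mathbb{F}_5$, two arbitrary quadratic functions on $F^2$ for the constraint, and an arbitrary real-valued function on $F^d$ with $d = 1$ for the Fourier rearrangement; all names and numerical values are illustrative choices of ours.

```python
import cmath
from itertools import product

p = 5
V = list(product(range(p), repeat=2))

# Part 1: the third difference of a quadratic map vanishes.
def Phi(x):
    phi1 = (x[0] * x[0] + 3 * x[0] * x[1] + 2 * x[1] + 4) % p
    phi2 = (2 * x[1] * x[1] + x[0]) % p
    return (phi1, phi2)

def third_difference(x, h):
    vals = [Phi(tuple((xi + k * hi) % p for xi, hi in zip(x, h))) for k in range(4)]
    return tuple((a - 3 * b + 3 * c - d) % p for a, b, c, d in zip(*vals))

constraint_holds = all(third_difference(x, h) == (0, 0) for x in V for h in V)

# Part 2: the constrained average equals the Fourier-side sum (here d = 1).
f = [0.9, 0.1, 0.4, 0.7, 0.2]      # a real-valued function on F_5

def e_F(t):
    return cmath.exp(2j * cmath.pi * (t % p) / p)

def fhat(xi):
    return sum(f[x] * e_F(-xi * x) for x in range(p)) / p

tuples = [t for t in product(range(p), repeat=4)
          if (t[0] - 3 * t[1] + 3 * t[2] - t[3]) % p == 0]
physical = sum(f[a] * f[b] * f[c] * f[d] for a, b, c, d in tuples) / len(tuples)
fourier = sum(abs(fhat(xi)) ** 2 * abs(fhat(3 * xi % p)) ** 2 for xi in range(p))
```

The second part uses that the function is real-valued, so that $\hat{\mathbf{f}}(-\xi) = \overline{\hat{\mathbf{f}}(\xi)}$; for such functions the two heuristic expressions agree exactly, and the approximation in the heuristic comes only from replacing $T_W(f)$ by the constrained average.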

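Lemma 5.3. Let $\mathcal{B}$ be a quadratic factor of complexity $d$ on an affine space $W$, with
rank at least $r$. Let $\Phi$ be the corresponding map from $W$ to $F^d$, and let $f(x) = \mathbf{f}(\Phi(x))$
be a bounded $\mathcal{B}$-measurable function. Then we have

$$\Big|T_W(f) - \sum_{\xi \in F^d} |\hat{\mathbf{f}}(\xi)|^2 |\hat{\mathbf{f}}(3\xi)|^2\Big| \leq |F|^{(4d-r)/2}.$$

Proof. Once again we use the Fourier expansion

$$f(x) = \sum_{\xi \in F^d} \hat{\mathbf{f}}(\xi) e_F(\xi \cdot \Phi(x))$$

to obtain

$$T_W(f) = \sum_{\xi_0, \xi_1, \xi_2, \xi_3 \in F^d} m(\xi_0, \xi_1, \xi_2, \xi_3) \prod_{i=0}^{3} \hat{\mathbf{f}}(\xi_i) \quad (5.1)$$

where

$$m(\xi_0, \xi_1, \xi_2, \xi_3) := \mathbb{E}_{x \in W, h \in \dot W} e_F\Big(\sum_{i=0}^{3} \xi_i \cdot \Phi(x + ih)\Big). \quad (5.2)$$

Write $\Sigma \subseteq (F^d)^4$ for the set of all 4-tuples $(\xi_0, \xi_1, \xi_2, \xi_3)$ such that

$$3\xi_0 = -\xi_1 = \xi_2 = -3\xi_3. \quad (5.3)$$

We will shortly show that, for all choices of the $\xi_i$,

$$|m(\xi_0, \xi_1, \xi_2, \xi_3) - 1_\Sigma(\xi_0, \xi_1, \xi_2, \xi_3)| \leq |F|^{-r/2}. \quad (5.4)$$

Assuming this, we can compare (5.1) with

$$\sum_{\xi \in F^d} |\hat{\mathbf{f}}(\xi)|^2 |\hat{\mathbf{f}}(3\xi)|^2 = \sum_{\xi_0, \xi_1, \xi_2, \xi_3 \in F^d} 1_\Sigma(\xi_0, \xi_1, \xi_2, \xi_3) \prod_{i=0}^{3} \hat{\mathbf{f}}(\xi_i),$$

obtaining

$$\Big|T_W(f) - \sum_{\xi \in F^d} |\hat{\mathbf{f}}(\xi)|^2 |\hat{\mathbf{f}}(3\xi)|^2\Big| \leq |F|^{-r/2} \sum_{\xi_0, \xi_1, \xi_2, \xi_3 \in F^d} \prod_{i=0}^{3} |\hat{\mathbf{f}}(\xi_i)|.$$

Applying Cauchy-Schwarz and Plancherel as in the proof of the preceding lemma, we
can bound this by $|F|^{(4d-r)/2}$ as desired.

It remains to prove (5.4). If $(\xi_0, \xi_1, \xi_2, \xi_3) \in \Sigma$ then this is trivial, since $m(\xi_0, \xi_1, \xi_2, \xi_3) = 1$
in this case. Suppose, then, that we do not have (5.3). Then (by a simple inspection)
we can find $i' \in \{0, 1, 2, 3\}$ such that $\sum_{i=0}^{3} (i - i')^2 \xi_i \neq 0$. We can use the change of
variables $x = y - i'h$ to write

$$m(\xi_0, \xi_1, \xi_2, \xi_3) = \mathbb{E}_{y \in W, h \in \dot W} e_F\Big(\sum_{i=0}^{3} \xi_i \cdot \Phi(y + (i - i')h)\Big).$$

It then follows from the rank condition that the phase $\sum_{i=0}^{3} \xi_i \cdot \Phi(y + (i - i')h)$ contains
a non-trivial quadratic component in $h$ of rank at least $r$. By averaging over $h$ and
applying Lemma 5.1 we see that $m(\xi_0, \xi_1, \xi_2, \xi_3)$ has magnitude at most $|F|^{-r/2}$. This
concludes the proof of (5.4) and hence of the lemma. □

We now take advantage of the pleasant positivity properties of the sum

$$\sum_{\xi \in F^d} |\hat{\mathbf{f}}(\xi)|^2 |\hat{\mathbf{f}}(3\xi)|^2$$

appearing in the preceding lemma to conclude the following lower bound.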



Corollary 5.4. Let $W$ be an affine space, and suppose that $\mathcal{B}$ is a quadratic factor on
$W$ with complexity at most $d$ and rank $r \geq 10d$. Suppose that $A \subseteq W$ is a set of density
at least $\alpha$. Then $T_W(\mathbb{E}(1_A|\mathcal{B})) \geq \alpha^4 - O(|F|^{-3d})$.

Proof. Write $f := \mathbb{E}(1_A|\mathcal{B})$ for notational brevity. Let $\Phi$ and $\mathbf{f}$ be as before: recall
that $\Phi(x) = (\phi_1(x), \dots, \phi_d(x))$, where the $\phi_i$ are the quadratics defining $\mathcal{B}$, and that
$\mathbf{f} : F^d \to \mathbb{C}$ is the function such that $\mathbf{f}(\Phi(x)) = f(x)$. Applying Lemma 5.3
and noting that $|F|^{(4d-r)/2} \leq |F|^{-3d}$, we have

$$T_W(f) \geq \sum_{\xi \in F^d} |\hat{\mathbf{f}}(\xi)|^2 |\hat{\mathbf{f}}(3\xi)|^2 - |F|^{-3d}.$$

In particular, discarding all the terms with $\xi \neq 0$, we have

$$T_W(f) \geq |\hat{\mathbf{f}}(0)|^4 - |F|^{-3d}.$$

Meanwhile, since $f$ has mean at least $\alpha$, we see from Lemma 5.2 that

$$|\hat{\mathbf{f}}(0)| \geq \alpha - |F|^{-3d}$$

(say). The claim follows. □

We can now prove Theorem 1.1. Let $\alpha$, $\varepsilon$, $A$ be as in that theorem. We will weaken
the conclusion of Theorem 1.1 by replacing $\varepsilon$ with $O(\varepsilon)$; clearly, the original statement
of the theorem can then be recovered by modifying $\varepsilon$ by a multiplicative constant. Thus,
our objective is now to find an affine subspace $W'$ of $F^n$ of codimension $O(\varepsilon^{-2^{20}})$ such
that $T_{W'}(1_A) \geq \alpha^4 - O(\varepsilon)$. We may assume that $\varepsilon \leq \alpha^4$, as the claim is trivial otherwise.

Set $\eta := \varepsilon$, $d := \lceil C_0 \varepsilon^{-2^{17}} \rceil = O(\varepsilon^{-2^{18}})$, and $r := 10d$ for some sufficiently large
constant $C_0 > 0$ depending only on $F$. By Theorem 4.10 we may find a subspace $W'$ of
codimension $O(\varepsilon^{-2^{20}})$ and a quadratic factor $B$ on $W'$ of rank at least $r$ and complexity
at most $d$ such that $A$ has density at least $\alpha - \varepsilon$ on $W'$, and such that

\[ \| 1_A - E(1_A|B) \|_{U^3(W')} \leq \varepsilon. \]

By Lemma 3.2 it follows that

\[ T_{W'}(1_A) \geq T_{W'}(E(1_A|B)) - O(\varepsilon). \]
On the other hand, from Corollary 5.4 one has

\[ T_{W'}(E(1_A|B)) \geq (\alpha - \varepsilon)^4 - O(|F|^{-3d}) = \alpha^4 - O(\varepsilon) - O(|F|^{-3d}). \]
By choice of $d$, we certainly have $O(|F|^{-3d}) = O(\varepsilon)$, and Theorem 1.1 follows.
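
To make the last step explicit, here is a short sketch of why the error term is absorbed. It assumes only that $0 < \varepsilon \leq 1$, that $|F| \geq 2$, and that the constant in the definition of $d$ is at least 1, so that $d \geq \varepsilon^{-1}$; these normalisations are our assumptions for the purpose of illustration.

```latex
% Sketch under the assumptions 0 < \varepsilon \leq 1, |F| \geq 2, d \geq 1/\varepsilon.
\begin{align*}
|F|^{-3d} \;\leq\; 2^{-3d} \;\leq\; 2^{-d} \;\leq\; \frac{1}{d} \;\leq\; \varepsilon,
\end{align*}
% using 2^d \geq d for every integer d \geq 1. Moreover, since
% \varepsilon \leq \alpha^4 \leq \alpha \leq 1, convexity of t \mapsto t^4 gives
\begin{align*}
(\alpha - \varepsilon)^4 \;\geq\; \alpha^4 - 4\alpha^3 \varepsilon \;\geq\; \alpha^4 - 4\varepsilon.
\end{align*}
% Combining with Lemma 3.2 and Corollary 5.4:
\begin{align*}
T_{W'}(1_A) \;\geq\; \alpha^4 - O(\varepsilon) - O(|F|^{-3d}) \;=\; \alpha^4 - O(\varepsilon).
\end{align*}
```

The choice of $d$ in the text is of course far larger than $\varepsilon^{-1}$; the size of $d$ is dictated by Theorem 4.10 rather than by this final estimate.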

Appendix A. Erratum to previous paper 

In this appendix we describe the error in our previous paper [9]. 

Fix some finite field $F$ of characteristic greater than 3, for example $F = \mathbb{F}_5$. The main
result [9, Theorem 1.1] of the aforementioned paper was a claimed proof of a statement
of the same type as Corollary 1.2: if $W$ is an affine space over $F$ and if $A \subseteq W$ has
density at least $n^{-c}$, then $A$ contains four distinct elements in arithmetic progression.
The attempted proof went via the so-called density increment strategy: supposing that
$A$ has density $\alpha$ and contains no 4-term progression, we located a reasonably large affine
subspace $W' \subseteq W$ on which the density of $A$ is appreciably larger than $\alpha$. Iteration of
this statement led to a contradiction.

This density increment was found in two steps. First of all the characteristic function
$1_A$ was approximated in the Gowers $U^3$-norm by a "quadratically structured" function
$E(1_A|B)$, where $B$ is a quadratic factor: a partition of the underlying space $W$ into
atoms defined by a collection of linear and quadratic phases. The relevant statement
here is [9, Theorem 6.6] (a type of Koopman-von Neumann theorem).

Secondly, we studied the number of 4-term progressions weighted by a quadratically
structured function such as $E(1_A|B)$. A precise statement is [9, Theorem 8.5]. This
eventually led to the conclusion that $A$ has increased density on some atom of $B$, which
we then decomposed into affine linear pieces to get the desired density increment.

This second phase required $B$ to be high-rank, which means that the quadratic phases
defining $B$ satisfy a rank separation condition ([9, Definition 8.2], and see also Definition
4.4 of the present paper). However, the factor $B$ output by the Koopman-von Neumann
theorem need not be high-rank. To get around this issue we stated and proved a lemma,
[9, Lemma 8.7], allowing one to refine an arbitrary quadratic factor $B$ to a high-rank
factor $B'$.

The problem with this is that, whilst $E(1_A|B)$ approximates $1_A$ in the $U^3$-norm, the
same need not be true* of $E(1_A|B')$. What is needed is a Koopman-von Neumann
theorem in which the output factor $B$ is already high-rank. A result of this type is the
main new development in this paper, specifically Theorem 4.10. Unfortunately we were
only able to achieve this with usable bounds after first passing to a (large) subspace
$W' \subseteq W$. We proceed using an energy-increment argument of basically the same type
as that usually used to prove Koopman-von Neumann theorems, but with an additional
rank-refinement step at each increment.

We remark that somewhat similar issues, albeit in a rather different language, are 
encountered (and correctly addressed) in [5]. See in particular Theorem 5.7 there. In
their application they cannot afford to pass to a subspace, and this is why their main 
theorem requires bounds of double-exponential type. 



* As written in [9], this issue manifests itself in a slightly different way, namely in the last line of the
paper when Theorem 8.8 is invoked in an attempt to prove Theorem 4.1. Unfortunately, Theorem 8.8
is applied to a function $g = E(f|B_2)$ rather than to $f$ itself, and a density increment on $g$ on a subspace
does not necessarily imply a corresponding density increment on $f$, because these subspaces do not
come from partitioning an atom of $B_2$, but rather from partitioning an atom from a finer factor $B'$.
The obvious fix for this is to replace $g$ by $E(f|B')$, but this runs into the difficulty mentioned in the
main text.